Abstract

Background: The goal of personalized medicine is to provide patients optimal drug screening and treatment 
based on individual genomic or proteomic profiles. Reverse-Phase Protein Array (RPPA) technology offers 
proteomic information of cancer patients which may be directly related to drug sensitivity. For cancer patients with 
different drug sensitivity, the proteomic profiling reveals important pathophysiologic information which can be 
used to predict chemotherapy responses. 

Results: The goal of this paper is to present a framework for personalized medicine using both RPPA and drug sensitivity (drug resistance or intolerance). In the proposed personalized medicine system, drug sensitivity is predicted by an augmented naive Bayesian classifier (ANBC), which adds edges between attributes to the network structure of the naive Bayesian classifier. For discriminative structure learning of ANBC, the local classification rate (LCR) is used to score augmented edges, and a greedy search algorithm is used to find the discriminative structure that maximizes the classification rate (CR). Once a classifier is trained on RPPA and drug sensitivity data from cancer patient samples, it can predict drug sensitivity from a patient's RPPA information.

Conclusion: In this paper we proposed a framework for personalized medicine in which a patient is profiled by RPPA and drug sensitivity is predicted by ANBC and LCR. Experimental results with lung cancer data demonstrate that RPPA can be used to profile patients for drug sensitivity prediction by a Bayesian network classifier, and that the proposed ANBC for personalized cancer medicine achieves, on average, better prediction accuracy than the naive Bayes classifier on small-sample data and outperforms other state-of-the-art classification methods in terms of classification accuracy.



Background 

In this paper, we present a framework for personalized cancer medicine with RPPA and drug sensitivity. The goal of personalized medicine is to provide optimal drug treatment based on an individual's drug sensitivity level, which avoids unnecessary cost and treatment. To achieve this, it is assumed that drug sensitivity can be predicted from quantitative patterns of protein expression, which represent molecular characteristics of individual patients [1,2]. More precisely, as medicinal effect is closely related to cancer signal transduction pathways, proteomic profiling can provide important pathophysiologic cues regarding responses to chemotherapies [3,4].

Figure 1 shows the process flow of the proposed framework for personalized cancer medicine. In step (1), a classifier is trained using RPPA and drug sensitivity data. A single classifier is generated for each drug, so the number of classifiers equals the number of drugs. In step (2), the RPPA of a patient's sample is provided as test data; then, in step (3), the classifier predicts High or Low as the drug sensitivity of the given test sample (other discretizations of sensitivity are possible, such as High/Neutral/Low). Based on the result of the prediction, the system can recommend the set of drugs that are more likely to have Low sensitivity.

Figure 1 Overview of the personalized medicine. In step 1, each classifier is trained by RPPA and the sensitivity of the corresponding drug. In steps 2 and 3, the patient's RPPA is tested in each classifier, and the sensitivity of each drug is predicted. As a final step, only the drugs predicted to have low sensitivity are recommended to the patient.

The prerequisite of the proposed personalized medicine is the proteomic profiling of patients who have different drug sensitivity levels. The proteomic profiling is implemented by measuring the expression level of selected proteins that could be related to signaling pathways of the target cancer. To quantitatively measure the systemic responses of proteins in pathways, RPPA is used in conjunction with quantum dot (Qdot) nanotechnology. RPPA, originally introduced in [5], is designed for quantitatively profiling protein expression levels in a large number of biological samples [6]. In RPPA, sample lysates are immobilized in a series of dilutions to generate dilution curves for quantitative measurements, which requires only a small amount (nanoliters) of sample, whereas other protein arrays immobilize antibodies. After primary and secondary antibodies are probed, the signal is detected by Qdot assays. Qdot is a nano-metal fluorophore with a brighter and more linear signal, and it also prevents the photo-bleaching effect that often occurs in organic fluorophores [7,8]. In addition, RPPA offers more accurate pathophysiologic information in a signaling pathway with posttranslational modifications (e.g. phosphorylation) not obtainable by gene microarray and protein-protein interactions.

For classification in the personalized medicine system, we employ a probabilistic approach, the Bayesian network classifier, in which the class label (drug sensitivity) is predicted together with its probability. This lets us select only the drugs that are predicted to have low sensitivity with high probability, rather than every drug predicted to have low sensitivity regardless of the probability. The naive Bayes classifier (NBC) [9] (Figure 2(A)) is competitive with state-of-the-art classifiers in many complex real-world applications. NBC assumes that all random variables (attributes) are conditionally independent of each other given the class variable. This assumption, however, is not realistic, especially in the biological domain, because interactive dependencies between cancer-related proteins in signaling pathways may exist. To overcome this limitation of NBC, how to incorporate the relationships between attributes to improve classification performance has been a central issue in Bayesian network classifier research in recent years. In [10], Friedman et al. proposed the Tree-Augmented Naive Bayesian classifier (TAN), which adds edges to the structure of NBC. Augmented edges in TAN are restricted to a tree structure, and the structure learning algorithm is based on the conditional mutual information between two variables given the class variable. In this paper, we focus on the augmented naive Bayes classifier (ANBC), in which each attribute has at least the class variable as a parent and at most two parents, and the augmented edges need not form a tree. To find a discriminative structure, we propose a new method based on the local classification rate (LCR) to score augmented edges and a greedy search algorithm to find the ANBC structure that has the highest classification rate. In the experiments, the proposed ANBC for personalized medicine is compared with state-of-the-art classifiers, including NBC and TAN, on lung cancer data.
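The probability-based selection of drugs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drug names, the per-class log scores, and the 0.8 threshold are hypothetical and do not come from the paper; the per-class scores are taken to be log prior plus log likelihood from one classifier per drug.

```python
import math

def posterior_from_log_scores(log_scores):
    """Normalize per-class log scores (log prior + log likelihood) into probabilities."""
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def recommend(per_drug_log_scores, threshold=0.8):
    """Keep drugs whose posterior probability of 'Low' reaches the (hypothetical) threshold."""
    picks = []
    for drug, log_scores in per_drug_log_scores.items():
        p_low = posterior_from_log_scores(log_scores)["Low"]
        if p_low >= threshold:
            picks.append((drug, p_low))
    # Most confident recommendations first
    return sorted(picks, key=lambda item: item[1], reverse=True)

# Hypothetical outputs of two per-drug classifiers for one patient
example_scores = {
    "drug_A": {"Low": math.log(0.9), "High": math.log(0.1)},
    "drug_B": {"Low": math.log(0.3), "High": math.log(0.7)},
}
recommended = recommend(example_scores)
```

With these made-up scores only `drug_A` passes the threshold; a hard High/Low decision alone could not rank drugs by confidence in this way.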

The paper is organized as follows. In the methods section, the basic concepts of Bayesian networks and Bayesian network classifiers are reviewed, and we give a detailed account of the proposed ANBC. In the results section, we present the experimental results in comparison with other classification algorithms. Finally, we conclude with a summary and future work in the conclusion section.

Method 

Bayesian networks 

Figure 2 An example of the Bayesian network structure for NBC and augmented NBC (ANBC). (A) In NBC, all the attributes are conditionally independent given the class variable. (B) In ANBC, each attribute has at most one other attribute as an additional parent, but the augmented edges of ANBC need not constitute a tree structure, which means that any attribute can have only the class variable as a single parent.

A Bayesian network is a directed acyclic graph that encodes a joint probability distribution over a set of random variables $X = \{X_1, \ldots, X_n\}$ (variable, attribute, and feature are used interchangeably). In this paper, we assume that all variables are discrete. A Bayesian network is defined by a pair $B = (G, \Theta)$. The first component $G$ is a network structure where each node represents a variable in $X$. If there is a directed edge from variable $X_j$ to $X_i$ ($X_j \rightarrow X_i$), $X_j$ is a parent of $X_i$. For each variable $X_i$, the set of parent variables is denoted by $\Pi_{X_i}$, and $X_i$ takes the state $x_{ik}$, the $k$th state of $\{x_{i1}, \ldots, x_{ir_i}\}$, where $r_i$ is the number of possible states of $X_i$. The second component $\Theta$ is a set of parameters for local conditional probability distributions representing the probability of a state of the variable given the states of its parents. A parameter is defined as

$$\theta_{ijk} = P_B(X_i = x_{ik} \mid \Pi_{X_i} = \pi_{ij}) \tag{1}$$

where $\pi_{ij} \in \{\pi_{i1}, \ldots, \pi_{iq_i}\}$ is the $j$th parent configuration (the states of the parents) of $\Pi_{X_i}$, and $q_i$ is the number of possible parent configurations given $\Pi_{X_i}$. The parameter $\theta_{ijk}$ denotes the probability that the state of $X_i$ is $x_{ik}$ given $\pi_{ij}$ as the state of $\Pi_{X_i}$. The structure of a Bayesian network defines a unique joint probability distribution over $X$ given by the product of local distributions as

$$P_B(X_1, \ldots, X_n) = \prod_{i=1}^{n} P_B(X_i \mid \Pi_{X_i}) \tag{2}$$
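As a concrete illustration of Equation (2), the following minimal Python sketch evaluates the joint probability of a toy three-variable network as the product of its local conditional probabilities. The network, the variable names, and all probability values are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical toy network: C -> X1, C -> X2, X1 -> X2 (all variables binary).
# parents[v] lists the parents of v; cpt[v] maps a tuple of parent states
# to the distribution over the states of v.
parents = {"C": (), "X1": ("C",), "X2": ("C", "X1")}
cpt = {
    "C": {(): {0: 0.6, 1: 0.4}},
    "X1": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "X2": {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.5, 1: 0.5},
        (1, 0): {0: 0.4, 1: 0.6},
        (1, 1): {0: 0.1, 1: 0.9},
    },
}

def joint_probability(assignment):
    """Product over all variables of P(variable state | parent states)."""
    return prod(
        cpt[v][tuple(assignment[p] for p in parents[v])][assignment[v]]
        for v in parents
    )

p_example = joint_probability({"C": 1, "X1": 1, "X2": 0})  # 0.4 * 0.8 * 0.1
```

Summing `joint_probability` over all eight joint states gives 1, which is a quick sanity check that the local tables define a valid joint distribution.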



Bayesian network classifier

A Bayesian network classifier (BNC) is a probabilistic classifier based on Bayes' theorem. The set of random variables is defined as $X = \{X_1, \ldots, X_{n-1}, C\}$, where the $n$th variable is the class variable. A Bayesian network classifier predicts the label $c$ that maximizes the posterior probability $P_B(C = c \mid X_1 = x_1, \ldots, X_{n-1} = x_{n-1})$ given a Bayesian network structure (Figure 2) and an instance $\{x_1, \ldots, x_{n-1}\}$ of the attributes.



Naive Bayes classifier 

In the naive Bayes classifier (NBC), the posterior probability is defined as

$$p(C \mid X_1, \ldots, X_{n-1}) = \frac{1}{Z}\, p(C) \prod_{i=1}^{n-1} p(X_i \mid C) \tag{3}$$

where $p(C) \prod_{i=1}^{n-1} p(X_i \mid C)$ (prior × likelihood) is the same as the joint probability in (2), since it is assumed that each variable $X_i$ is conditionally independent of every other variable $X_j$ for $i \neq j$ given the class variable $C$ as the parent of $X_i$ (Figure 2(A)). We can cancel the constant $Z$ since the evidence $Z = p(X_1, \ldots, X_{n-1})$ is independent of $C$ when maximizing the posterior. Hence, the classifier is defined as $\arg\max_{c \in C} p(C = c) \prod_{i=1}^{n-1} p(X_i = x_i \mid C = c)$ given a test instance $\{x_1, \ldots, x_{n-1}\}$. In our application, the discrete class variable $C = \{\text{High}, \text{Low}\}$ indicates a drug sensitivity level, and an attribute $X_i$ refers to a discretized protein expression level in RPPA. So, in NBC, it is assumed that each protein is conditionally independent of the other proteins and dependent only on the drug sensitivity. However, this assumption is unrealistic, since the selected proteins of RPPA could have biological interactions in the signaling pathway affecting the efficacy of the drug.

To calculate the likelihood in the classifier, the maximum likelihood (ML) parameters that maximize the log likelihood (LL) are first obtained by frequency estimation from the training data in the form

$$\theta_{ijk} = \frac{N_{ijk}}{N_{ij}} \tag{4}$$

where $N_{ijk}$ denotes the number of instances in the training data where $X_i = x_{ik}$ and $\Pi_{X_i} = \pi_{ij}$, and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. After the parameters are estimated, these parameters $\Theta = \{\theta_{ijk}\}_{i \in \{1, \ldots, n\},\, j \in \{1, \ldots, q_i\},\, k \in \{1, \ldots, r_i\}}$ are used to compute the likelihood $p(X_i \mid C)$ of the classifier given a test instance and a class label. In addition, in the implementation the sum of log likelihoods, $\sum \log p(X_i \mid C)$, is taken instead of the product of all likelihoods, $\prod p(X_i \mid C)$, to avoid numerical underflow.

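A minimal Python sketch of the two steps above, frequency estimation as in Equation (4) and classification in log space, might look as follows. The function names and the toy data are hypothetical, and the handling of zero counts (a small pseudo-count) is a simplifying assumption made only so that the logarithm stays finite.

```python
import math
from collections import Counter, defaultdict

def train_nbc(rows, labels):
    """Collect the counts needed for the frequency estimates N_ijk / N_ij."""
    class_counts = Counter(labels)
    # cond_counts[(i, c)][x] = number of training rows with X_i = x and C = c
    cond_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, x in enumerate(row):
            cond_counts[(i, c)][x] += 1
    return class_counts, cond_counts

def predict_nbc(model, row, pseudo=0.5):
    """Return argmax_c of log p(c) + sum_i log p(x_i | c)."""
    class_counts, cond_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for i, x in enumerate(row):
            n_ijk = cond_counts[(i, c)][x]
            # Assumption: a zero count is replaced by a small pseudo-count.
            score += math.log((n_ijk if n_ijk > 0 else pseudo) / n_c)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical discretized expression levels of two proteins
toy_rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y"), ("b", "y"), ("b", "x")]
toy_labels = ["High", "High", "High", "Low", "Low", "Low"]
toy_model = train_nbc(toy_rows, toy_labels)
```

Working with sums of logarithms rather than products matters once dozens of attributes are multiplied together, since each factor is at most 1.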
Augmented naive Bayes classifier 

To address this limitation of NBC, Friedman et al. [10] introduced the TAN classifier, in which edges are added to the structure of NBC. These additional edges are called augmented edges. The idea is that if a strong dependency between $X_1$ and $X_2$ exists, a directed edge is added between $X_1$ and $X_2$ (Figure 2(B)). The maximum number of edges added to relax the independence assumption between variables is $n - 1$, but the augmented edges of TAN are constrained to form a tree-like Bayesian network. Instead, we focus on the augmented naive Bayes classifier (ANBC), in which an attribute $X_i$ has at least the class variable as a parent and at most two parents, the class variable and another attribute $X_j$, and the class variable has no parent. More precisely, the augmented edges of TAN are restricted to a tree structure, whereas the augmented edges of ANBC need not form a tree (i.e., some nodes may have no augmented edge in ANBC). Once the structure is constructed and the parameters are estimated from training data, we can classify an instance into the class label that maximizes the posterior given, up to the normalizing constant, by

$$p(c) \prod_{i=1}^{n-1} p(x_i \mid \Pi_{X_i}^{\setminus C}, c) \tag{5}$$

where $\Pi_{X_i}^{\setminus C}$ denotes the parent set of variable $X_i$ except the class variable $C$.

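To make Equation (5) concrete, the sketch below scores each class for a toy ANBC in which attribute 1 has attribute 0 as its augmented parent. All names and probability tables are hypothetical; the tables are chosen so that the class depends on the combination of the two attributes, a pattern the plain NBC factorization cannot express.

```python
import math

def anbc_predict(row, prior, cpt, extra_parent):
    """Return the class maximizing log p(c) + sum_i log p(x_i | augmented parent state, c).

    cpt[i][(c, pa)][x] is p(X_i = x | parent state pa, C = c); pa is None when
    attribute i has only the class variable as a parent.
    """
    def log_score(c):
        s = math.log(prior[c])
        for i, x in enumerate(row):
            j = extra_parent.get(i)
            pa = row[j] if j is not None else None
            s += math.log(cpt[i][(c, pa)][x])
        return s
    return max(prior, key=log_score)

# Hypothetical parameters: X_0 is uninformative on its own, while X_1 depends
# on both the class and X_0 through the augmented edge X_0 -> X_1.
prior = {"High": 0.5, "Low": 0.5}
cpt = {
    0: {("High", None): {0: 0.5, 1: 0.5}, ("Low", None): {0: 0.5, 1: 0.5}},
    1: {
        ("High", 0): {0: 0.1, 1: 0.9}, ("High", 1): {0: 0.9, 1: 0.1},
        ("Low", 0): {0: 0.9, 1: 0.1}, ("Low", 1): {0: 0.1, 1: 0.9},
    },
}
extra_parent = {1: 0}  # attribute 1 has attribute 0 as its second parent
```

Here the instance (0, 1) is assigned High and (1, 1) is assigned Low, even though neither attribute alone separates the classes.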
Discriminative structure learning 

We focus on discriminative structure learning for ANBC, since comparative research [11] has shown that a good discriminative structure is sufficient to produce a good discriminative classifier. Indeed, in their experimental results, BNCs with discriminative structures and generative parameters outperform not only BNCs with discriminative structures and discriminative parameters but also BNCs with generative structures and either discriminative or generative parameters. In [11,12], the classification rate (CR) is used to score how discriminative a given structure is. The CR is defined as

$$CR = \frac{1}{|S|} \sum_{m=1}^{|S|} I\big(BNC(x_1^m, \ldots, x_{n-1}^m), c^m\big), \tag{6}$$

where $|S|$ is the number of instances in the training data $S$, and $BNC(x_1, \ldots, x_{n-1})$ is a Bayesian network classifier, $\arg\max_{c \in C} p(C \mid X_1, \ldots, X_{n-1})$, given a Bayesian network structure. $I(\hat{c}^m, c^m)$ is an indicator function for $\hat{c}^m = c^m$, where $\hat{c}^m$ is the class label predicted by $BNC(x_1^m, \ldots, x_{n-1}^m)$ and $c^m$ is the correct class label (the state of the class variable $C$ of the instance). To estimate the CR of a given structure, the BNC is trained and tested on the training data $S$ using leave-one-out. In [11], a greedy method, hill-climbing search, is used to find a structure with locally optimal CR by iteratively updating the structure (adding or deleting an augmented edge). However, the CR-based scoring and searching approach is computationally more expensive than other methods because of the exponential search space ($(n-1)^{n-2}$), as training and testing of the updated structure is repeated in every iteration. To improve the CR-based approach, we propose a new algorithm whose basic idea is to reduce the search space by excluding unnecessary edges. Each edge between attributes is evaluated by a modified CR. We call the proposed score function the local classification rate (LCR), as the score measures how likely each augmented edge is to contribute to an increase in the classification rate when only that edge is added to NBC. LCR is defined as

$$LCR_{ij} = \frac{1}{|S|} \sum_{m=1}^{|S|} \Big[ I\big(ANBC_{ij}(x_1^m, \ldots, x_{n-1}^m), c^m\big) - I\big(NBC(x_1^m, \ldots, x_{n-1}^m), c^m\big) \Big], \tag{7}$$

where $ANBC_{ij}$ is an ANBC in which the single directed edge from $j$ to $i$ ($E_{ij}$) is augmented in the structure of NBC. More precisely, $ANBC_{ij}(x_1, \ldots, x_{n-1})$ is defined as $\arg\max_{c \in C} p(C = c)\, p(X_i = x_i \mid X_j = x_j, C = c) \prod_{h, h \neq i} p(X_h = x_h \mid C = c)$. As the second term is the CR of NBC, it is constant with respect to $i$ and $j$. $LCR_{ij} > 0$ indicates that the edge $E_{ij}$ could increase the classification rate of ANBC when $E_{ij}$ is augmented in the structure of NBC. For ANBC, the number of all possible augmented edges is $(n-1)(n-2)$. After we calculate LCR for all possible augmented edges, the edges that have negative LCR are excluded from the structure search space. To further decrease the number of available augmented edges, we select the edge $E_{ij}$ only if $LCR_{ij}$ is equal to the maximum of $LCR_{ih}$ over the candidate parents $X_h$. Because variable $X_i$ can have only a single $X_j$ as a parent besides the class variable, only the variable that maximizes the LCR is selected as the parent of $X_i$. In the searching step, the structure is iteratively updated by randomly adding or deleting an augmented edge while maintaining the acyclic property and the limit on the number of parents per attribute (each attribute can have at most two parents including the class variable).
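The LCR scoring and edge pruning described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, both terms of Equation (7) are estimated with leave-one-out, zero counts are replaced by 0.5/1, and the random add/delete search that follows is not shown.

```python
import math
from collections import Counter

def train(rows, labels, extra_parent):
    """Count-based ANBC; extra_parent maps a child index i to its parent index j."""
    prior, joint, margin = Counter(labels), Counter(), Counter()
    for row, c in zip(rows, labels):
        for i, x in enumerate(row):
            pa = row[extra_parent[i]] if i in extra_parent else None
            joint[(i, c, pa, x)] += 1
            margin[(i, c, pa)] += 1
    return prior, joint, margin

def predict(model, row, extra_parent):
    prior, joint, margin = model
    total = sum(prior.values())
    def log_score(c):
        s = math.log(prior[c] / total)
        for i, x in enumerate(row):
            pa = row[extra_parent[i]] if i in extra_parent else None
            n_ijk, n_ij = joint[(i, c, pa, x)], margin[(i, c, pa)]
            if n_ijk == 0 or n_ij == 0:
                n_ijk, n_ij = 0.5, 1.0  # assumed substitution for zero counts
            s += math.log(n_ijk / n_ij)
        return s
    return max(prior, key=log_score)

def loo_hits(rows, labels, extra_parent):
    """Number of correct leave-one-out predictions for the given structure."""
    hits = 0
    for m in range(len(rows)):
        model = train(rows[:m] + rows[m + 1:], labels[:m] + labels[m + 1:], extra_parent)
        hits += predict(model, rows[m], extra_parent) == labels[m]
    return hits

def lcr_scores(rows, labels):
    """LCR for every single augmented edge j -> i, keyed by (i, j)."""
    n_attr, size = len(rows[0]), len(rows)
    nbc_hits = loo_hits(rows, labels, {})  # empty map = plain NBC
    return {
        (i, j): (loo_hits(rows, labels, {i: j}) - nbc_hits) / size
        for i in range(n_attr) for j in range(n_attr) if i != j
    }

def best_parents(scores):
    """Drop edges with negative LCR and keep, per child, one parent with maximal LCR."""
    best = {}
    for (i, j), s in scores.items():
        if s >= 0 and (i not in best or s > scores[(i, best[i])]):
            best[i] = j
    return best
```

The result of `best_parents` is only a candidate set: it can contain both directions of an edge, so the search step still has to reject additions that would create a cycle.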

Experiments 

Lung cancer data 

Table 1 55 antibodies used in RPPA

pSrc(Y527)          p53  ERK   pERK   GSK3  pGSK3  CyclinB1    pRb
pIRS1(Y1179)        p38  pp38  PTEN   NQO1  Stat3  pNF-kBp65   pStat3
pIRS1(Y896)         p16  pJNK  pPTEN  CDK4  pAKT   CyclinD3    EGFR
pIGF1R(Y1158-1162)  Src  RAF1  pRAF1  Bcl2  JNK    b-Catenin   b-Actin
pIGF1R(Y1162-1163)  p27  pp53  Hsp27  IKBa  pIKBa  Vimentin    pMDM2
pEGFR(Y1173)        p21  sClu  IGF1R  MDM2  IRS1   pSrc(Y416)  gH2AX
E-Cadherin          Rb   AKT   pBcl2  mTOR  pmTOR  NF-kBp65

Prefix p indicates phosphorylation.

In this section, lung cancer data is used to gauge the performance of the proposed personalized medicine system with the new score function LCR for learning the discriminative structure of ANBC. The RPPA data for lung cancer consist of 55 antibodies (Table 1) and 75 cell lines. There are 24 drugs for which the drug sensitivity of the cell lines is measured, but a drug is not tested in all cell lines, which means that each drug is tested in a different set of cell lines. The sensitivity of each drug is measured with 43 cell lines on average. As preprocessing, the drug sensitivity is discretized into 2 states (High or Low) by the K-means clustering algorithm, in which the maximum and minimum values of drug sensitivity are used as the initial centroids. The protein expression level of RPPA is discretized by the minimum-entropy-based discretization method [13].
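The discretization of drug sensitivity can be sketched as a one-dimensional K-means with two clusters seeded at the minimum and maximum values. The function name, the toy values, and the mapping of the upper cluster to the label High are assumptions made for illustration.

```python
def two_means_discretize(values, max_iter=100):
    """Split 1-D values into 'Low'/'High' with k-means (k = 2) seeded at min and max."""
    lo_c, hi_c = min(values), max(values)
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer centroid (ties go to the lower cluster)
        labels = ["Low" if abs(v - lo_c) <= abs(v - hi_c) else "High" for v in values]
        lows = [v for v, lab in zip(values, labels) if lab == "Low"]
        highs = [v for v, lab in zip(values, labels) if lab == "High"]
        new_lo = sum(lows) / len(lows)  # never empty: the minimum is always 'Low'
        new_hi = sum(highs) / len(highs) if highs else hi_c
        if (new_lo, new_hi) == (lo_c, hi_c):
            break  # centroids stopped moving
        lo_c, hi_c = new_lo, new_hi
    return labels

# Hypothetical sensitivity measurements for six cell lines
toy_labels = two_means_discretize([0.1, 0.2, 0.15, 5.0, 4.8, 5.2])
```

Seeding at the extremes makes the procedure deterministic, so repeated runs on the same drug give the same High/Low split.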

Experimental setup 

Table 2 Accuracy (%) of sensitivity prediction for 24 drugs with 20 selected features

Drug Name               SVML   SVMP   SVMR   LR     RF     NBC    TAN    ANBC
8-aminoadenosine        68.89  68.89  68.89  71.11  55.56  91.11  93.33  93.33
8-Cl-adenosine          51.11  55.56  55.56  55.56  64.44  93.33  86.67  92.89
Carboplatin             71.11  73.33  73.33  62.22  71.11  86.67  80.00  88.00
Chloroquine             70.45  65.91  65.91  54.55  70.45  97.73  88.64  95.91
Cisplatin               79.07  65.11  65.11  58.14  81.40  90.70  93.02  91.63
Cyclopamine             28.89  40.00  17.78  51.11  42.22  84.44  80.00  86.67
Diazonamide             80.49  80.49  80.49  60.98  70.74  92.68  90.24  90.73
Docetaxel               90.24  90.24  90.24  78.05  90.24  100    100    100
Doxorubicin             41.30  56.52  56.52  43.48  58.70  89.13  76.09  88.70
Erlotinib               86.05  86.05  86.05  88.37  90.70  88.37  97.67  88.37
Etoposide               55.81  62.79  62.79  53.49  65.12  95.35  90.70  94.88
Gefitinib               90.00  90.00  90.00  90.00  90.00  95.00  65.00  95.00
Gemcitabine             81.81  81.81  81.81  61.36  77.27  100    100    100
Gemcitabine/Cisplatin   73.81  71.43  71.43  61.90  66.67  95.24  65.24  91.43
Irinotecan              47.50  55.00  55.00  50.00  40.00  92.50  90.00  92.50
Orexin                  83.33  83.33  83.33  77.78  83.33  100    100    100
Paclitaxel              85.11  85.11  85.11  61.70  85.11  100    93.62  100
Paclitaxel/Carboplatin  90.20  90.20  90.20  82.35  90.20  98.04  98.04  98.04
Peloruside A            80.95  80.95  80.95  66.67  80.95  92.86  92.86  95.24
Pemetrexed              59.09  52.27  52.27  68.18  65.91  93.18  81.82  93.18
Pemetrexed/Cisplatin    61.90  61.90  61.90  57.14  47.62  83.33  85.71  90.00
Smac Mimetic            84.62  84.62  84.62  66.67  82.05  97.44  92.31  97.44
Sorafenib               87.23  87.23  87.23  78.72  85.11  97.87  91.49  97.87
Vinorelbine             79.07  79.07  79.07  51.16  76.74  90.70  93.02  90.70
Average                 72.00  72.83  71.90  64.61  72.15  93.57  91.06  93.85

We conducted comparative evaluations with the following classification algorithms: Support Vector Machine with three different kernels, namely the linear kernel (SVML), polynomial kernel (SVMP), and radial basis function kernel (SVMR); Logistic Regression (LR); Random Forest (RF); Tree-Augmented Naive Bayes (TAN) [10]; NBC; and the proposed ANBC. To evaluate the performance of the different methods, we measure the average prediction accuracy using leave-one-out estimation. Since the structure is randomly updated during the search, leave-one-out is performed 5 times for ANBC. The original continuous values of RPPA are used in SVM, LR, and RF. For parameter estimation, only maximum likelihood parameters are used for NBC, TAN, and ANBC, since we compare only the structure learning methods rather than discriminative parameter learning methods. To avoid zero conditional probabilities in the logarithm of the likelihood when we calculate the joint probability, we set $\theta_{ijk} = N'_{ijk} / N'_{ij}$ with $N'_{ijk} = 0.5$ and $N'_{ij} = 1$ if $N_{ijk} = 0$ or $N_{ij} = 0$. Accuracy is calculated as the ratio of the number of correct predictions to the total number of samples in leave-one-out estimation. In addition, for a fair comparison, feature selection is applied for all classification methods, because some of the methods may not produce good results on high-dimensional data and not all 55 proteins may be directly related to drug sensitivity. For SVM, LR, and RF, attributes are selected using Information Gain [14] and Ranker as implemented in Weka [15]. To select proteins (features) for NBC, TAN, and ANBC, we used the mutual information between each attribute and the class variable. The number of features to be selected is predefined as 10, 20, and 30.
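The mutual-information feature selection used for NBC, TAN, and ANBC in the experimental setup can be sketched as below. The function names and the toy data are hypothetical; the score is the standard empirical mutual information between a discretized attribute and the class, and the top k attributes are kept.

```python
import math
from collections import Counter

def mutual_information(xs, cs):
    """Empirical mutual information (in nats) between an attribute and the class."""
    n = len(xs)
    count_x, count_c, count_xc = Counter(xs), Counter(cs), Counter(zip(xs, cs))
    # sum over observed (x, c) pairs of p(x, c) * log(p(x, c) / (p(x) p(c)))
    return sum(
        (n_xc / n) * math.log((n_xc * n) / (count_x[x] * count_c[c]))
        for (x, c), n_xc in count_xc.items()
    )

def select_features(rows, labels, k):
    """Indices of the k attributes with the highest mutual information with the class."""
    n_attr = len(rows[0])
    scores = [mutual_information([r[i] for r in rows], labels) for i in range(n_attr)]
    ranked = sorted(range(n_attr), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# Hypothetical data: attribute 0 determines the class, attributes 1 and 2 carry no information
toy_rows = [("a", "z", "p"), ("a", "z", "q"), ("b", "z", "p"), ("b", "z", "q")]
toy_labels = ["High", "High", "Low", "Low"]
```

In this toy example attribute 0 has mutual information log 2 with the class and the other two have 0, so `select_features(toy_rows, toy_labels, 1)` keeps only attribute 0.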

Experimental results 

Table 2 shows the classification accuracy of each classification method for 24 drugs with 20 selected features (the results with 10 and 30 features are in Additional file 1). Overall, ANBC outperformed support vector machine classification with three different kernels, logistic regression, and the random forest algorithm on all feature sets (10, 20, and 30 features). ANBC outperforms NBC with 10 and 20 selected features but not with 30 features. Surprisingly, NBC performed better than TAN, which was developed to overcome the limitation of the independence assumption in NBC. The reason for this might be the small sample size of our data (43 per drug on average): the empirical results of [16] show that NBC can outperform discriminatively trained models on small-sample data sets, and the number of samples must be sufficient for the conditional probabilities (the likelihood in the classifier form) to represent the data. In Table 2, ANBC achieved 100% accuracy for four drugs: Docetaxel, Gemcitabine, Orexin, and Paclitaxel. Logistic regression shows the lowest average accuracy, 64.61%, and SVM with the radial basis function kernel has the lowest single accuracy, 17.78% for Cyclopamine. The scatter plots (Figure 3) compare pairs of algorithms. Each point represents one of the 24 drugs, where the y and x coordinates of a point are the accuracy of ANBC and of its counterpart, respectively. The red points above the diagonal line represent the drugs whose sensitivity is predicted better by ANBC (vertical axis) than by the counterpart (horizontal axis). In Figure 3(f), 6 red points are relatively far from the diagonal line, while NBC has better accuracy for 3 drugs (blue points). ANBC also has better accuracy than TAN for all but four drugs (Figure 3(g)). Figure 4 shows the accuracy of each classifier using different feature sets. The performance of each method is similar to that in Table 2. ANBC, NBC, and TAN outperform the other methods on all three feature sets. For ANBC and NBC, the prediction accuracy increases slightly with a larger number of features, while the performance of TAN and SVM is independent of the number of features. For LR and RF, the accuracy decreases with more features. The results imply that Bayesian-network-based classifiers (ANBC, NBC, and TAN) can work more effectively than other methods on RPPA and drug sensitivity data, and they suggest that classification for drug sensitivity prediction with RPPA can potentially be improved by effectively using the dependencies among proteins. However, the result of TAN implies that too many augmented edges may decrease the accuracy on small-sample data.

Figure 3 Scatter plots of the accuracy of the proposed method vs. state-of-the-art classifiers. Panel titles include (a) SVML vs. ANBC, (b) SVMP vs. ANBC, (c) SVMR vs. ANBC, and (g) TAN vs. ANBC; the axes show accuracy (%).

Conclusion 

In this paper, we introduced personalized medicine with RPPA and drug sensitivity. The goal of personalized medicine is to provide the optimal therapy to patients who have different biological profiles regarding the target cancer. For this goal, a Bayesian network classifier is applied to predict drug sensitivity given a patient's RPPA. We propose a new score function, LCR, for learning the discriminative structure of a Bayesian network classifier. All augmented edges are scored by LCR, which is based on the difference between the CR before and after a single edge is augmented. In other words, the score represents how likely the edge augmented in NBC is to increase the classification rate in ANBC. Based on the scored edges, the discriminative structure is discovered through hill-climbing search. Since it is known that NBC normally outperforms discriminative learning algorithms on small-sample data (in our data the average number of samples is 43), we focus on augmenting only a minimal number of edges to improve the performance while largely maintaining the advantage of the NBC structure, whereas TAN augments too many edges in NBC. In the experiments, ANBC with the proposed score function is compared with well-known classification algorithms such as support vector machine, logistic regression, and random forest. We also compare it with the Bayesian network classifiers TAN and NBC with generative parameters. The results show that ANBC outperforms the other classification algorithms and achieves slightly better accuracy than NBC on small-sample data, supporting the claim that the dependencies among proteins can be used to improve sensitivity prediction for personalized medicine. To overcome the limitation of sample size, we plan to investigate discriminative parameter learning and effective feature selection for Bayesian network classifiers as future work.

Additional material



Additional file 1: Accuracy of sensitivity prediction for 24 drugs with 10 and 30 selected features. The file includes two tables for classification accuracy in 10 and 30 selected features.